A Fast and High Throughput SQL Query System for Big Data

Authors

  • Feng Zhu
  • Jie Liu
  • Lijie Xu
Abstract

Relational queries have long played an important role in data analysis, but scaling out a traditional SQL query system remains a challenging problem. In this paper, we introduce a fast, high-throughput, and scalable system that executes read-only SQL queries efficiently by taking advantage of the distributed architecture of NoSQL stores. We adopt HBase as the storage layer and design a distributed query engine (DQE) that works with it to execute SQL queries. Our system also includes distinctive index and cache mechanisms to accelerate query processing. Finally, we evaluate our system on real-world big data crawled from Sina Weibo, where it achieves good performance on nineteen representative SQL queries.
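
The abstract names three ingredients: a key-value storage layer, an index, and a cache for read-only queries. The following is a minimal sketch of how such pieces can fit together, not the paper's DQE and not the HBase API. `ToyStore`, `ToyQueryEngine`, and `select_where_equals` are hypothetical names introduced only for illustration, and the store is an in-memory dictionary.

```python
from collections import OrderedDict, defaultdict


class ToyStore:
    """Hypothetical in-memory stand-in for a row-key/value storage layer."""

    def __init__(self, rows):
        # rows maps a row key to a dict of column name -> value.
        self.rows = dict(rows)
        self.scans = 0  # number of full scans, tracked for illustration

    def get(self, key):
        return self.rows[key]

    def scan(self):
        self.scans += 1
        return self.rows.items()


class ToyQueryEngine:
    """Answers `SELECT * WHERE column = value` using an index and an LRU cache."""

    def __init__(self, store, indexed_column, cache_size=2):
        self.store = store
        self.indexed_column = indexed_column
        self.cache_size = cache_size
        self.cache = OrderedDict()
        # Secondary index: column value -> list of row keys (built with one scan).
        self.index = defaultdict(list)
        for key, row in store.scan():
            self.index[row[indexed_column]].append(key)

    def select_where_equals(self, column, value):
        cache_key = (column, value)
        if cache_key in self.cache:
            # Cache hit: safe to reuse because the workload is read-only.
            self.cache.move_to_end(cache_key)
            return self.cache[cache_key]
        if column == self.indexed_column:
            # Index lookup followed by point reads, no full scan.
            result = [self.store.get(k) for k in self.index.get(value, [])]
        else:
            # Fallback: full scan with a filter.
            result = [row for _, row in self.store.scan() if row.get(column) == value]
        self.cache[cache_key] = result
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return result
```

In this sketch, a query on the indexed column touches only the matching rows, a query on any other column falls back to a full scan, and a repeated query is served from the cache. Caching whole results like this is only sound under the read-only assumption stated in the abstract.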

Similar resources

Fast Data Management with Distributed Streaming SQL

To stay competitive in today’s data driven economy, enterprises large and small are turning to stream processing platforms to process high volume, high velocity, and diverse streams of data (fast data) as they arrive. Low-level programming models provided by the popular systems of today suffer from lack of responsiveness to change: enhancements require code changes with attendant large turn-aro...

FACTORBASE: SQL for Multi-Relational Model Learning

We describe FACTORBASE, a new framework that leverages a relational database management system (RDBMS) to support multi-relational graphical model learning. The basic insight behind our approach is that an RDBMS can be leveraged to manage not only big data but also big models [1, 2]: first, model structure and model parameters can be managed efficiently without having to be stored i...

FlashQueryFile: Flash-Optimized Layout and Algorithms for Interactive Ad Hoc SQL on Big Data

A high-performance storage layer is vital for allowing interactive ad hoc SQL analytics (OLAP style) over Big Data. The paper makes a case for leveraging flash in the Big Data stack to speed up queries. State-of-the-art Big Data layouts and algorithms are optimized for hard disks (i.e., sequential access is emphasized over random access) and result in suboptimal performance on flash given its dras...

Intensional RDB Manifesto: a Unifying NewSQL Model for Flexible Big Data

In this paper we present a new family of Intensional RDBs (IRDBs) that extend traditional RDBs with Big Data, flexible, and "open schema" features, while preserving user-defined relational database schemas and all preexisting user applications containing SQL statements for deploying such relational data. The standard RDB data is parsed into an internal vector key/v...

Selecting the Most Suitable Query Language for Using Outer Joins to Extract Data in Datalog Mode in the DES Deductive Database System

Deductive database systems are designed around a logical data model. Unlike a relational database management system (RDBMS), in which data are stored in tables, a deductive database system stores data as facts. The Datalog Educational System (DES) is a deductive database system in which Datalog mode is the default mode. It can extract data using outer joins with three query la...

Publication date: 2012